Method for reconstructing geostationary ocean color satellite data based on data interpolating empirical orthogonal functions

ABSTRACT

The present disclosure discloses a method for reconstructing geostationary ocean color satellite data based on Data INterpolating Empirical Orthogonal Functions. The method includes steps of: 1) for an original ocean color remote sensing area, using concentration detection and edge detection to eliminate abnormal image elements; 2) in the Data INterpolating Empirical Orthogonal Functions, using Laplace smoothing filter to smoothly filter a time covariance matrix; 3) for temporal modes value obtained by the Data INterpolating Empirical Orthogonal Functions, first decomposing it into sub-temporal mode, then adopting empirical modal decomposition, and finally performing interpolation, to obtain an ocean color remote sensing reanalysis data set that maintains its original accuracy and full coverage of time and space. Advantage of the method is that it fully considers high time resolution characteristics of the geostationary ocean color satellite data and considers conditions that there is a continuous high missing rate and complete missing in data missing, and it is suitable for the reconstruction of the geostationary ocean color satellite data, with good accuracy and high interpretation rate. The present disclosure has very important practical application value for the reconstruction of the geostationary ocean color satellite data.

TECHNICAL FIELD

The present disclosure relates to the field of ocean remote sensing and,in particular, to a method for reconstructing geostationary ocean colorsatellite data based on Data INterpolating Empirical OrthogonalFunctions.

BACKGROUND

Ocean remote sensing satellites can carry out large-scale and long-termobservations of the global ocean, to provide human beings with a datasource that cannot be replaced by other observation methods, forunderstanding the ocean and knowing the ocean. Emergence ofgeostationary ocean color satellites provides a new perspective forocean remote sensing research—monitoring changes of ocean color fromtime to time. The first geostationary ocean color sensor—GOCI(Geostationary Ocean Color Imager) in the world provides, for the firsttime, ocean color data having time resolution of up to an hour, makingit possible to monitor changes in the ocean environment over time.However, affected by clouds, fog and haze over the ocean, there are manycases of continuous high missing rate or even complete missing of data,which has a serious impact on temporal and spatial continuity of data,such that usability of ocean color products obtained by inversion isgreatly reduced, and its due value cannot be fully utilized.

The data missing is a common problem in ocean remote sensing data, andin order to complement missing remote sensing data, a variety of dataprocessing methods have been developed at home and abroad for datareconstruction, such as optimal interpolation method, kriginginterpolation method, Data INterpolating Empirical Orthogonal Functions,etc., among which, the Data INterpolating Empirical Orthogonal Functions(DINEOF) is a relatively effective method for reconstructing the oceanremote sensing data and has been used to reconstruct a variety of theocean remote sensing data, such as ocean surface temperature,chlorophyll a concentration, total suspended matter concentration, oceansurface salinity, etc. Due to its own limitations, the DataINterpolating Empirical Orthogonal Functions has many adaptabilityproblems for reconstructing the geostationary ocean color satellitedata. Reconstruction research of Liu Xiaoming and Wang Menghua has ashorter time span than quarters, and it focuses on time sections havingrelatively low data missing rates in summer and autumn.

SUMMARY

An object of the present disclosure is to overcome the existingshortcomings and provide a method for reconstructing geostationary oceancolor satellite data based on the Data INterpolating EmpiricalOrthogonal Functions.

To achieve the object of the present disclosure, technical solutionsprovided are as follows.

Provided is a reconstruction method for geostationary ocean colorsatellite data based on Data INterpolating Empirical OrthogonalFunctions, and as the reconstruction method includes steps of:

S1: performing preprocessing for each geostationary ocean colorsatellite remote sensing image in an original ocean color remote sensingdata set to be reconstructed, and setting abnormal image elements asdata missing points, and eliminating, from the data set,extremely-high-missing-rate data having a missing rate higher than anupper limit in time or space;

S2: constructing a two-dimensional data matrix based on a time-spacefield of preprocessed ocean color remote sensing data set, performingreconstruction by the Data INterpolating Empirical Orthogonal Functions,and, before each iteration decomposition, filtering a time covariancematrix using Laplace smoothing filter; and after the reconstruction,obtaining an optimal retained mode number, and corresponding temporalmodes and spatial modes;

S3: decomposing the temporal modes into characteristic mode vectorsaccording to columns, each column of the characteristic mode vectors isdecomposed into a plurality of sub-temporal mode according to time towhich the data belongs, then performing empirical modal decomposition oneach of the plurality of sub-temporal mode, using a cubic splineinterpolation method to complement data at positions corresponding tothe extremely-high-missing-rate data in inherent mode functioncomponents and residual after decomposition, and re-superimposing andmerging to obtain complete temporal modes having a continuous timeseries;

S4: using the spatial modes obtained in the step S2 and the completetemporal modes obtained in the step S3, to reconstruct to obtain ageostationary ocean color satellite data set with full coverage in timeand space.

Based on the above technical solution, each step is preferablyimplemented in the following specific ways. The preferred implementationmanners of each step can be combined correspondingly without conflict,which does not constitute a limitation.

Preferably, in the step S1, for each geostationary ocean color satelliteremote sensing image, the abnormal image elements are identified andprocessed according to steps of S11 to S15:

S11: for each geostationary ocean color satellite remote sensing imagein an original ocean color remote sensing data set to be reconstructed,setting a space window having a side length of L by respectively takingeach valued image element in the image as a center;

S12: for each valued image element, calculating concentration O_(conc)of the image element by taking all valued image elements in the spacewindow corresponding to the image element as a sample data set O, and:

$O_{conc} = {❘\frac{O_{i} - O_{m}}{\delta}❘}$

where O_(i) is a current image element value of the concentration to becalculated, O_(m) is the median of the sample data set O; and a standarddeviation δ is: δ=k*MAD(O), where k is a scale factor constant, andMAD(O) is a median absolute deviation of the sample data set O;

S13: for each valued image element respectively, calculating an edgedetection value O_(prox) of the image element, and O_(prox) being apositive value if a valueless image element exists around the imageelement, or O_(prox) being zero if no valueless image element existsaround this image element;

S14: for each valued image element respectively, calculating an anomalydetection result O_(final) of the image element:

O _(final) =w _(prox) O _(prox) +w _(conc) O _(conc)

where w_(prox) is an edge detection weight, w_(conc) is a concentrationdetection weight, and it is satisfied that w_(prox)+w_(conc)=1; and

S15: according to a preset threshold O_(threshold) of abnormal imageelement detection, taking the image element having the O_(final) largerthan the threshold O_(thresold) as the abnormal image element, andsetting the abnormal image element as the data missing point.

Preferably, in the step S1, eliminating the extremely-high-missing-ratedata includes:

in terms of time, eliminating an image corresponding to the moment if amissing rate of the geostationary ocean color satellite remote sensingimage at any moment is larger than an upper limit of the missing rate;and in terms of space, eliminating the image element values of theposition point in all images if a missing rate of the time series dataat any position point in all geostationary ocean color satellite remotesensing images is larger than the upper limit of the missing rate.

Further preferably, the upper limit of the missing rate is 95%.

Preferably, the step S2 includes steps of:

S21: setting the time-space field of the preprocessed ocean color remotesensing data set as an m×n two-dimensional data matrix X⁰, where m isthe number of the spatial position points in the geostationary oceancolor satellite remote sensing image, n is the total number of momentsrepresented by the images in the ocean color remote sensing data set,and m>n, wherein for the matrix X⁰, the m rows respectively representthe time series values of m different spatial position points, and the ncolumns respectively represent the values of all the spatial positionpoints at n different moments;

S22: calculating an average X⁰ of all valid data in the matrix X⁰, andletting X=X⁰−X⁰ ; and randomly selecting a part of the valid data fromthe matrix X as a cross-validation set X^(c), and setting values of allthe missing points of the data in the matrix X and the elementsextracted into the cross-validation set X^(c) to be zero;

S23: iterating on the number P of the retained modes, calculating thespatial modes U_(P), the temporal modes V_(P), and a root mean squareerror R between the reconstructed value and the original value of thecross-validation set under different P values; wherein steps S231 toS234 are executed in each iteration:

S231: performing a singular value decomposition on the matrix X,calculating P largest singular values and singular vectors througheigenvector decomposition, then obtaining an m×P spatial modes U_(P),P×P singular value matrix S_(P), and n×P temporal modes V_(P), and acalculation equation thereof is:

X^(T)Xv_(i) = ρ_(i)²v_(i) $u_{i} = \frac{{Xv}_{i}}{\rho_{i}}$

where u_(i) is an i-th column o the spatial modes U_(P), v_(i) is ani-th column of the temporal modes V_(P), ρ_(i) is a singular valuecorresponding to u_(i) and v_(i), i=1, . . . , P; and X^(T)X is an n×ntime covariance matrix;

S232: using the Laplace smoothing filter to first filter columns of thetime covariance matrix, and then filter rows of the time covariancematrix; wherein for the rows or columns of the time covariance matrix,each is regarded as a time series data set C={c₁, c₂, . . . , c_(n)}including data of n moments, and a time set corresponding to C is T={t₁,t₂, . . . , t_(n)} c_(I) is time series data of a l-th time t_(l),l=1,2, . . . , n, and it is satisfied that t₁<t₂< . . . <t_(n); and aformula for h Laplace smoothing filter is

$c_{l}^{({s + 1})} = {c_{l}^{(s)} + \frac{w_{l + 1} - w_{l}}{t_{l + 1}^{\prime} - t_{l}^{\prime}}}$$w_{l} = {\alpha\frac{c_{l} - c_{l - 1}}{t_{l} - t_{l - 1}}}$t_(l)^(′) = t_(l − 1) + t_(l)

where, c_(l) ^((s)) represents data c_(l) before the Laplace smoothingfilter, c_(l) ^((s+1)) represents data c_(l) after the Laplace smoothingfilter; and a is a smoothing strength, 0<α<1;

S233: after completing the Laplace smoothing filter once, calculatingreconstructed values of all data missing points in the matrix X, whereina reconstructed value X_(i,j) ^(re) of a missing point located in thei-th row and j-th column of the matrix X is:

$X_{i,j}^{re} = {\left( {U_{P}S_{P}V_{P}^{T}} \right)_{i,j} = {\sum\limits_{p = 1}^{P}{{\rho_{p}\left( u_{p} \right)}_{i}\left( v_{p}^{T} \right)_{j}}}}$

where: (u_(p))_(i) is an i-th element in a p-th column of the spatialmodes U_(P), (v_(p) ^(T))_(j) is a j-th element in a p-th column of thetemporal modes V_(P), and ρ_(p) is a corresponding singular value; and

S234: calculating a root mean square error R between the reconstructedvalue and the true value of all the elements in the cross-validationset:

$R = \sqrt{\frac{1}{N}{\sum_{p = 1}^{N}\left( {X_{p}^{re} - X_{p}^{c}} \right)^{2}}}$

where N is the total number of data in the cross-validation set; X_(p)^(re) and X_(p) ^(c) are the reconstructed value and true value of thep-th element in the cross-validation set, respectively;

wherein if the root mean square error R is larger than an errorthreshold, the reconstructed values of all missing points in the matrixX obtained in the step S233 are updated to the matrix X, then the stepsS231 to S233 are repeated for the updated matrix X, until the root meansquare error R converges below the threshold, then the repetition stops,and the converged root mean square error R is recorded as the root meansquare error R^(P) under the current retained mode number P; and

S24: comparing the root mean square error R^(P) under different retainedmode numbers P=1, 2, . . . , k_(Max), taking the P value correspondingto the smallest R^(P) as an optimal retained mode number P, to obtainoptimal temporal modes V_(P) and optimal spatial modes U _(P) .

Preferably, in the step S23, an iteration range of the retained modenumber P is [1,k_(MAX)], and the maximum number of iterationsk_(MAX)≤min(m,n).

Further preferably, in the step S23, if the number of repetitions of thesteps S231 to S233 for the singular value decomposition and the Laplacesmoothing filter reaches the maximum number of times, the repetitionstops, and a final root mean square error R is directly recorded as theroot mean square error R^(P) under the current retained mode number P.

Preferably, the step S3 includes steps of:

S31: decomposing each column of characteristic mode vectors in theoptimal temporal modes V _(P) , and when decomposing, decomposing databelonging to a same time in different days into one sub-temporal modeswith a daily cycle, and the i-th column characteristic mode vectorv_(i)(t) is decomposed into:

v _(i)(t)=(v _(i1)(d1),v _(i2)(dk), . . . , v _(iK)(dK))

where: K is the total number of daily data frequencies of thegeostationary ocean color satellite remote sensing image; v_(ik)(dk) isthe sub-temporal modes corresponding to the k-th time in v_(i)(t), andincludes data of the k-th time of all dates in v_(i)(t); i=1,2, . . . ,P; k=1,2, . . . , K; dk represents the dates included in the data inv_(ik)(dk), and a missing date corresponding to theextremely-high-missing-rate data is not included in dk;

S32: performing empirical decomposition for each sub-temporal modesv_(ik)(dk) in each column of the characteristic mode vectors:

${v_{ik}({dk})} = {{\sum\limits_{j = 1}^{n}{c_{j}({dk})}} + {r_{n}({dk})}}$

where n is the total number of moments represented by the images in theocean color remote sensing data set, and c_(j)(dk) is the j-th inherentmode function component, and r_(n)(dk) is residual of the decomposition;

S33: for each sub-temporal modes v_(ik)(dk), using a cubic splineinterpolation method to interpolate and fill all the inherent modefunction components c_(j)(dk) and the data missing date positions in theresidual signal r_(n)(dk), then re-superimposing all the inherent modefunction components and residual signals after the interpolation andfilling, to obtain the sub-temporal modes v_(ik)(Dk) having a completetime series; wherein Dk includes all consecutive dates having the k-thtime;

S34: re-superimposing and merging the K sub-temporal mode v_(ik)(Dk) ofa same column of the characteristic mode vectors, to obtain the completecharacteristic mode vector v_(i)(T) including all moments, and Trepresents a moment corresponding to the extremely-high-missing-ratedata; and

S35: re-superimposing and merging all P columns of the completecharacteristic mode vectors v_(i)(T), to obtain the complete temporalmodes V_(T) having a continuous time series.

Further preferably, the total number K of the daily data frequencies ofthe geostationary ocean color satellite remote sensing image is 8.

Preferably, the step S4 includes steps of:

S41: using the spatial modes U _(P) obtained in the step S2 and thecomplete temporal modes V_(T) obtained in the step S3, to recalculatethe reconstructed values of all missing points in the matrix X, and thereconstructed value X_(i,j) ^(re) of the missing point located in thei-th row and j-th column of the matrix X is:

$X_{i,j}^{re} = {\left( {U_{\overset{\_}{P}}S_{\overset{\_}{P}}V_{T}^{T}} \right)_{i,j} = {\sum\limits_{p = 1}^{\overset{\_}{P}}{{{\overset{\_}{\rho}}_{p}\left( {\overset{\_}{u}}_{p} \right)}_{i}\left( {\overset{\_}{v}}_{p}^{T} \right)_{j}}}}$

where (ū_(p))_(i) is the i-th element in the p-th column of the spatialmodes U _(P) , (v _(p) ^(T))_(j) is the j-th element in the p-th columnof the temporal modes V_(T), ρ _(p) is a corresponding singular value;and

S42: updating the reconstructed values of all missing points in thematrix X obtained in the step S41 to the matrix X, and keeping values ofother elements of the matrix X unchanged, to obtain the time-space fieldof the ocean color remote sensing data set with the missing values beingcomplemented, so as to obtain a geostationary ocean color satellite dataset fully covering time and space.

Compared with the related art, the present disclosure has beneficialeffects.

In view of the limitations of the DINEOF method, the present disclosurefully considers its characteristics of high time resolution andconsiders conditions that there is a continuous high missing rate andcomplete missing in data missing, first uses the concentration detectionand the edge detection to check the abnormal image elements in theremote sensing data, to avoid interference of outliers oncross-validation in the DINEOF method, then introduces the Laplacesmoothing filter to filter the time covariance matrix in the DINEOFdecomposition, to reduce data fault caused by night intervals andcontinuous data missing, then performs second decomposition on thetemporal modes obtained by reconstruction and then interpolates, torealize the reconstruction of high missing rate data and continuousmissing data, so as to finally obtain an ocean color remote sensingreanalysis data set that maintains its original accuracy and fullcoverage of time and space. The present disclosure improves the numberof the retained modes (Mode), reconstruction accuracy (MSE, RMSE, MAE),signal interpretation rate and reconstruction rate, and the operation issimple and easy, and the reconstruction result is intuitive. The presentdisclosure provides demonstration and reference for reconstruction ofother remote sensing data with high time resolution characteristics orocean color satellite data with large time interval changes.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a flow chart of a method for reconstructing geostationaryocean color satellite data based on Data INterpolating EmpiricalOrthogonal Functions;

FIG. 2 shows comparison between original and reconstructed images of aGOCI total suspended matter concentration on Jun. 2, 2017.

FIG. 3 shows reconstruction accuracy results of a GOCI total suspendedmatter concentration on Jun. 2, 2017.

DESCRIPTION OF EMBODIMENTS

The present disclosure will be further elaborated and illustrated belowin conjunction with the accompany drawings and specific embodiments.

As shown in FIG. 1 , it is a flowchart of a method for reconstructinggeostationary ocean color satellite data based on the Data INterpolatingEmpirical Orthogonal Functions provided in a preferred embodiment of thepresent disclosure. Main steps thereof include 4 steps, which are S1 toS4, respectively:

S1: performing preprocessing for each geostationary ocean colorsatellite remote sensing image (GOCI image) in an original ocean colorremote sensing data set to be reconstructed, to identify abnormal imageelements, and setting the abnormal image elements as data missingpoints, and eliminating, from the data set, extremely-high-missing-ratedata having a missing rate higher than an upper limit in time or space.

S2: constructing a two-dimensional data matrix based on a time-spacefield of preprocessed ocean color remote sensing data set, performingreconstruction by the Data INterpolating Empirical Orthogonal Functions,and, before each iteration decomposition, using Laplace smoothing filterto filter a time covariance matrix; after the reconstruction, obtainingan optimal retained mode number, and corresponding temporal modes andspatial modes.

S3: decomposing the temporal modes obtained in S2 into characteristicmode vectors according to columns, each column of the characteristicmode vectors is decomposed into a plurality of sub-temporal modeaccording to time to which the data belongs, then performing empiricalmodal decomposition on each of the sub-temporal modes, using a cubicspline interpolation method to complement the data at positionscorresponding to the extremely-high-missing-rate data in the inherentmode function components and the residual after decomposition,re-superimposing and merging to obtain complete temporal modes having acontinuous time senes.

S4: using the spatial modes obtained in S2 and the complete temporalmodes obtained in S3, to reconstruct to obtain the geostationary oceancolor satellite data set with full coverage in time and space.

The specific implementation manners and effects of the S1 to S4 in thisembodiment will be described in detail below.

First of all, although the GOCI data has undergone data quality control,there may still be outliers within a reasonable range. Therefore, one ofpurposes of the step S1 is to process these abnormal image elements, toprevent noise from being introduced during the reconstruction. The mostcommon method of concentration detection is to judge whether data is“outlier” based on an average and a standard deviation of all the data.This method is relatively simple to calculate, but if there are a largenumber of outliers in the data set to be detected, it will greatlyaffect description of overall characteristics of the data by the averageand the standard deviation, thereby resulting in greater uncertainty inoutlier detection. In order to avoid the above situation, using a medianand a median absolute deviation (MAD) to replace the average and thestandard deviation is a more robust detection method.

In this embodiment, for each geostationary ocean color satellite remotesensing image, in the preprocessing process, weighted results of theconcentration detection and edge detection are used as a standard, andthe abnormal image elements are identified and processed according toS11 to S15:

S11: for each geostationary ocean color satellite remote sensing imagein the original ocean color remote sensing data set to be reconstructed,setting a space window having a side length of L by respectively takingeach valued image element in the image as a center, and L can beadjusted according to actual situations.

S12: for each valued image element, taking all the valued image elementsin the space window corresponding to this image element as a sample dataset O, and calculating concentration O_(conc) of the image element:

$O_{conc} = {❘\frac{O_{i} - O_{m}}{\delta}❘}$

where O_(i) is a current image element value of the concentration to becalculated, O_(m) is the median of the sample data set O; a standarddeviation δ is: δ=k*MAD(O), k is a scale factor constant, and MAD(O) isa median absolute deviation of the sample data set O.

For normally distributed data, it is assumed that a normal value fallsin a middle 50% area, then the outliers fall in a 50% area on bothsides, and it can be obtained by converting that k=1.4826. According tothe 3σ standard, image elements with O_(conc) larger than 3 can beregarded as abnormal image elements.

S13: to enhance the stability of monitoring of the abnormal imageelements, the present disclosure uses a simple edge detection method tomark image elements at an edge of a cloud, based on causes of theabnormal image elements, that is, the probability of the abnormal imageelements occurring at the edge of the cloud is higher: for each valuedimage element respectively, calculating an edge detection value O_(prox)of the image element, and if there are valueless image elements in 8image elements around this image element, then O_(prox) takes a positivevalue M, and if there are no valueless image elements in the 8 imageelements around this image element, then O_(prox) will take zero. Inthis embodiment, in order to keep consistent in value with the result ofthe concentration detection, if there are valueless image elementsaround the image element, then it is marked that M=3, and if no, then itis marked as 0, that is:

$O_{prox} = \left\{ \begin{matrix}{3,} & {{there}{are}{valueless}{image}{elements}{around}} \\{0,} & {{There}{is}{no}{valueless}{image}{elements}{around}}\end{matrix} \right.$

S14: for each valued image element respectively, calculating an anomalydetection result O_(final) of the image element:

O _(final) =w _(prox) O _(prox) +w _(conc) O _(conc)

where w_(prox) is edge detection weight, w_(conc) is concentrationdetection weight, and it is satisfied that w_(prox)+w_(conc)=1. w_(prox)and w_(conc) can be adjusted according to actual conditions.

Using the weighted result to detect abnormal image elements, on the onehand, improves the stability of the detection, to avoid excessivepunishment by a single method, which leads to more data missing and lossof the original change characteristics of the data, and on the otherhand, it is also possible to adjust the weight according to sources andcharacteristics of the abnormal image elements in given original data,to improve the effect of the outlier detection.

S15: according to a preset threshold O_(threshold) of the abnormal imageelement detection, taking image elements having the O_(final) largerthan the threshold O_(threshold) as the abnormal image elements, andsetting the abnormal image elements as the data missing points. Amagnitude of the threshold O_(threshold) determines the number ofabnormal image elements that are eliminated, and in this embodiment, thedetection threshold O_(threshold) is set to 3.

In addition, if the original data has a high missing rate in the time orspace dimension (usually more than 95%), then these data not only cannotprovide effective information, but may affect a final result. Therefore,the Data INterpolating Empirical Orthogonal Functions (hereinafterreferred to as DINEOF) cannot reconstruct theseextremely-high-missing-rate data and needs to eliminate before thereconstruction, otherwise, the reconstruction result of the image havinga high missing rate is replaced by the average of the original remotesensing data set.

In this embodiment, a method for eliminating theextremely-high-missing-rate data is: in terms of time, if the missingrate of the geostationary ocean color satellite remote sensing image atany moment is larger than an upper limit of the missing rate, then animage corresponding to that moment will be eliminated; in terms ofspace, if a missing rate of the time series data at any position pointin all geostationary ocean color satellite remote sensing images islarger than the upper limit of the missing rate, then the image elementvalues of the position point in all images will be eliminated, and theeliminated image element is not regarded as the data missing point. Theupper limit of the missing rate can generally be set to 95%.

The original data reconstructed by the DINEOF method should remaincontinuous in time, and time intervals of the data are the same. For thegeostationary ocean color satellite data, the data itself is continuouswithin the same day and the time intervals are consistent, while afterhigh-frequency daytime data, there is a nighttime gap with no data for along time, and the nighttime gap usually exceeds 12 hours and is longerin winter. On the other hand, since cover time of the cloud is usuallylonger than a monitoring period of the geostationary ocean colorsatellite, there are many cases where the data has a continuous highmissing rate or even complete missing in local data, which furtherreduces the continuity of the data. In data having poor continuity,there will be some characteristics having abnormal changes in the data,but these characteristics do not represent an entire spatial-temporalvariability of the data, resulting in that the DINEOF method overfits asmall amount of the original data and the abnormal changes, so that thereconstructed image does not necessarily reflect due spatial andtemporal changes of the given image and causes the reconstruction datato have obvious faults, and there are very unreasonable continuous timeimages occurring. The present disclosure considers a filter to smooththe original remote sensing data, and this filter, unlike generalsmoothing methods (such as a median filter, a mean filter, etc.), needsto retain original change characteristics of the data while eliminatingmutations and noise. In this regard, the present disclosure introducesthe Laplace smoothing method to construct a filter suitable forsmoothing the time series data and apply it to the reconstructionprocess of DINEOF. The following first introduces a theory of afiltering process.

The Laplace filter is applied to m time series of a data matrix X (m×n),and it can be understood as extending the Laplace smoothing filter (n×1)acting on one-dimensional time series to two-dimensional, a constructedLaplace filter L is an n×n matrix, and the matrix X is filtered in atime domain:

{tilde over (X)}=XL

where, {tilde over (X)} is an original matrix after the Laplacesmoothing filter. Theoretically, this filter can directly filter X, butprocessing and computation of the original matrix is too large,requiring calculations of m times. Considering that in singular valuedecomposition (SVD) calculations of {tilde over (X)}, only the first fewtemporal modes and spatial modes are necessary, and P largest singularvalues and singular vectors are calculated by characteristics vectordecomposition, with an equation being:

{tilde over (X)} ^(T) {tilde over (X)}v _(i)=ρ_(i) ² v _(i)

where, v_(i) is the i-th column of the temporal modes V, ρ_(i) is thecorresponding singular value, and i=1, . . . , P. Since {tilde over(X)}^(T){tilde over (X)} is an n×n matrix, it is defined as a timecovariance matrix A. The filter L can be combined with the SVDdecomposition of {tilde over (X)}, to change filtering of the originalmatrix to filtering of the time covariance matrix. An eigenvectordecomposition expression is performed on the matrix {tilde over (X)}filtered by the Laplace smoothing filter, and it can be seen that thetime covariance matrix after the Laplace smoothing filter is:

Ã=L ^(T) AL

According to the above formula, the Laplace filter L filters columns ofthe time covariance matrix A first, and then filters rows of the timecovariance matrix A. Thus, while greatly reducing the computation (fromm times to 2n times), sensitivity to the missing data is also reduced(there are no missing values in the time covariance matrix). Optimal αand P values can be determined through a root mean square error of theDINEOF decomposition. Generally, the value of α is kept constant and thevalue of P is optimized.

Based on the foregoing theory, the reconstruction process by the DataINterpolating Empirical Orthogonal Functions in this embodiment ismainly implemented through the step S2, and a specific method thereof isdescribed in detail below:

S21: setting the time-space field of the preprocessed ocean color remotesensing data set as an m×n two-dimensional data matrix X⁰, where m isthe number of the spatial position points in the geostationary oceancolor satellite remote sensing image, it is the total number of momentsrepresented by the images in the ocean color remote sensing data set,and m>n, each row of the matrix X⁰ respectively represents a time seriescomposed of pixel values of one spatial position point at differentmoments, and each column respectively represents values of all spatialposition points at a moment, that is, the pixel values on a GOCI imageat a certain moment. Therefore, the m rows of the matrix X⁰ respectivelyrepresent the time series values of the m different spatial positionpoints, and the it columns respectively represent the values of all thespatial position points at n different moments. I is denoted as a set ofthe missing data points (including the original missing ones and theabnormal image elements identified in the S1), the missing values in thematrix are represented by NaN, and P is denoted as the number ofretained modes, P≤min(m,n).

S22: calculating an average X⁰ of all valid data in the matrix X⁰, andletting X=X⁰−X⁰ , which means subtracting the average X⁰ from the valuesof all valued elements in X⁰; a part of the valid data is randomlyselected from the matrix X as a cross-validation set X^(c), which isgenerally 1% of the total amount of the valid data, as a criterion forjudging the optimal number of the retained modalities. A point in X atthe same position in the cross-validation set is also regarded as amissing point, and a value assigned thereto is NaN. A value of 0 isassigned to all the missing points of the data in the matrix X and theelements extracted into the cross-validation set X^(c), that is, toreplace all the points in X having the value of NaN with 0.

S23: iterating on the number P of the retained modes, calculating thespatial modes U_(P), the temporal modes V_(P), and the root mean squareerror R between the reconstructed value and the original value of thecross-validation set under different P values. P is a positive naturalnumber, and within a value range of [1,k_(MAX)], initializing P=1 beforestarting iteration, and then P=P+1 after each iteration, and the maximumnumber of the iterations k_(MAX)≤min(m,n).

Steps S231 to S234 are executed in each iteration:

S231: performing the singular value decomposition on the matrix X,calculating the P largest singular values and the singular vectorsthrough the eigenvector decomposition, then obtaining the m×P spatialmodes U_(P), P×P singular value matrix S_(p), and n×P temporal modesV_(P), and a calculation equation is:

${{X^{T}{Xv}_{i}} = {\rho_{i}^{2}v_{i}}}{u_{i} = \frac{{Xv}_{i}}{\rho_{i}}}$

where u_(i) is the i-th column of the spatial modes U_(P), v_(i) is thei-th column of the temporal modes V_(P), ρ_(i) is the singular valuecorresponding to u_(i) and v_(i), i=1, . . . , P; X^(T)X is the n×n timecovariance matrix; a superscript T represents transpose (the samebelow).

Thus, the spatial modes U_(P), the temporal modes V_(P), and thesingular value matrix S_(p) under the current retained mode number P canbe obtained for subsequent call.

S232: using the Laplace smoothing filter to filter the columns of thetime covariance matrix first, and then filter the rows of the timecovariance matrix. When filtering the rows or columns of the timecovariance matrix, each of them is regarded as a time series data setC={C₁, c₂, . . . , c_(n)} containing data of n moments, and the time setcorresponding to C is T={t₁, t₂, . . . , t_(n)}, c_(l) is time seriesdata of the l-th time t₁, l=1,2, . . . , n, and it is satisfied thatt₁<t₂< . . . <t_(n). A formula for the Laplace smoothing filter is:

${c_{l}^{({s + 1})} = {c_{l}^{(s)} + \frac{w_{l + 1} - w_{l}}{t_{l + 1}^{\prime} - t_{l}^{\prime}}}}{w_{l} = {\alpha\frac{c_{l} - c_{l - 1}}{t_{l} - t_{l - 1}}}}{t_{l}^{\prime} = {t_{l - 1} + t_{l}}}$

where c_(l) ^((s)) represents data c_(l) before the Laplace smoothingfilter, and c_(l) ^((s+1)) represents data c_(i) after the Laplacesmoothing filter; a is smoothing strength, 0<α<1. A superscript srepresents the s-th Laplace smoothing filter, and the number s of theLaplace smoothing filter is 1, 2, . . . s_(max). An initial value of sis 1, and s=s+1 after performing the filtering once.

The smoothing effect of the filtering depends on the values of a and themaximum number of iterations s_(max). This filtering realizes thesmoothing of the time series data while retaining the changecharacteristics of the time series, and it considers an influence oftime relativity.

S233: after completing the Laplace smoothing filter once, calculatingreconstructed values of all the data missing points in the matrix X. Ifan element in the i-th row and j-th column of the matrix X in the matrixX is a missing point, then the reconstructed value X_(i,j) ^(re) of themissing point is:

$X_{i,j}^{re} = {\left( {U_{P}S_{P}V_{P}^{T}} \right)_{i,j} = {\sum\limits_{p = 1}^{P}{{\rho_{p}\left( u_{p} \right)}_{i}\left( v_{p}^{T} \right)_{j}}}}$

where: i and j are space and time subscripts of X, (u_(p))_(i) is thei-th element in the p-th column of the spatial modes U_(P), (v_(p)^(T))_(j) is the j-th element in the p-th column of the temporal modesV_(P), and ρ_(p) is the corresponding singular value.

Thus, the reconstructed values of all data missing points in the matrixX can be obtained, and since a part of the data is taken from the matrixX in the previous step as the cross-validation set, the error can becalculated after filtering based on a true value of the cross-validationset and a predicted reconstructed value.

S234: calculating the root mean square error R between the reconstructedvalue and the true value of all the elements in the cross-validationset:

$R = \sqrt{\frac{1}{N}{\sum_{p = 1}^{N}\left( {X_{p}^{re} - X_{p}^{c}} \right)^{2}}}$

where N is the total number of data in the cross-validation set; X_(p)^(re) and X_(p) ^(c) are the reconstructed value and true value of thep-th element in the cross-validation set, respectively;

if the root mean square error R is larger than an error threshold, thereconstructed values of all the missing points in the matrix X obtainedin S233 are updated to the matrix X, and in this update process, onlythe data missing points are updated, and other positions remainunchanged. Therefore, the updated matrix X^(re) can be expressed as:

X ^(re) =X+∂X

where ∂X is a matrix composed of the reconstructed values of the missingpoints.

Then the updated X^(re) is used as the matrix X again, and repeating thesteps S231 to S233. The root mean square error R needs to be calculatedfor every repetition, until the root mean square error R converges belowthe threshold, then the repetition stops, and the converged root meansquare error R is recorded as the root mean square error R^(P) under thecurrent retained mode number P. However, considering a calculationefficiency, it is necessary to set an upper limit of the number ofrepetitions, that is, the aforementioned s_(max). If the number ofrepetitions of the steps S231 to S233 for the singular valuedecomposition and the Laplace smoothing filter reaches the maximumnumber of times s_(max), the repetition will not continue, and a finalroot mean square error R is directly recorded as the root mean squareerror R^(P) under the current retained mode number P.

S24: in the execution of the step S23, the iteration range of theretained mode number P is [1,k_(MAX)], the root mean square error R^(P)under different retained mode numbers P=1, 2, . . . , k_(MAX) can beobtained, and R^(P) are compared, the P value corresponding to thesmallest R^(P) is taken as the optimal retained mode number P, to obtainthe optimal temporal modes V _(P) and the optimal spatial modes U _(P) .If P=a, then the root mean square error R is the smallest, so theoptimal retained mode number is P=a.

After the above S2 step, the optimal retained mode number P can befinally determined, and the optimal singular value matrix S _(P) , thetemporal modes V _(P) and the optimal spatial modes U _(P) can beobtained. As mentioned above, DINEOF relies on statistical informationof the original data to reconstruct, and it cannot reconstructaccurately the data having a very high missing rate larger than 95%, soit needs to be eliminated before reconstruction, and this part ofeliminated data does not participate in the Laplace filtering process ofstep S2 and needs to rely on step S3 to perform interpolation andreconstruction.

The temporal modes reconstructed using the DINEOF method itself reflectsthe change characteristics of the time series of the data set, and thedata of adjacent times have correlation and continuity in temporal modesvalues corresponding to the same mode, thus it is a feasible method tointerpolate the temporal modes to obtain the temporal modes vectorcorresponding to times having extremely high missing rates. But forgeostationary ocean color satellite data, this method has certainshortcomings and cannot be directly used to reconstruct data of timeshaving extremely high missing rates. A main reason is that, for thetemporal modes obtained from the reconstruction of the geostationaryocean color satellite data, continuity on a time resolution scale islimited to the day. That is to say, using the adjacent data on differentdays to interpolate needs to make the temporal modes correspond with thetime resolution, to perform non-equidistant interpolation, and forcharacteristic mode data with periodicity, an effect of interpolationusing traditional methods is poor. On the other hand, if it is coveredby clouds and fog on the current day, the data at all times during theday will be completely continuously missing data. In this case, usingadjacent values to interpolate at intervals of time, it is difficult tomatch an interpolation result with actual situations. In response tothese problems, the present disclosure adopts a method of reducing thetime resolution, uses a characteristic that high-time-resolution remotesensing data still has good periodic characteristics at lowtime-resolution, to propose to first decompose, at a lower timeresolution, the temporal modes obtained by reconstruction, constructmultiple sub-temporal mode having relatively high continuity andconsistent time intervals, then use the empirical modal decompositionmethod to decompose the sub-temporal mode, use the cubic splineinterpolation to fill in missing signals in decomposition results,finally merge them into sub-temporal mode with continuous time andconsistent intervals, so as to obtain new temporal modes that includesall times, and this method decomposes the temporal modes twice and theninterpolates, so the present disclosure calls it “time mode quadraticdecomposition interpolation”. A core of this method, on the one hand, isto use a fact that the data acquisition time of the geostationary oceancolor satellite data is at equal time intervals at low time resolution,that is, data acquired at the same time for multiple consecutive dayshave the same time interval (24 hours), and the data has periodicity. Onthe other hand, the inherent mode function components obtained throughempirical modal decomposition are chosen for interpolation, instead ofdirectly interpolating the sub-temporal modes vector, and this isbecause volatility of the decomposed signal is lower than that of theoriginal signal, making it good for interpolation and filling, and theinherent mode function component contains more details that the originalvector does not have, and higher accuracy can be achieved.

In the present disclosure, for the convenience of presentation, dataobtained at the same time on different dates (for example, 8 o'clock inthe morning on different dates) are referred to as data at the sametime. The data interval of the same time on two adjacent days is a fixed24 h.

In this embodiment, a specific method for performing the “time modequadratic decomposition interpolation” in step S3 is as follows:

S31: decomposing each column of characteristic mode vectors in theoptimal temporal modes V _(P) , and when decomposing, decomposing databelonging to the same time in different days into one sub-temporal modeswith a daily cycle, the i-th column characteristic mode vector v_(i)(t)is decomposed into:

v _(i)(t)={v _(i1)(d1),v _(i2)(dk), . . . ,v _(iK)(dK)}

where: i=1,2, . . . , P; k=1,2, . . . , K. K is the total number ofdaily data frequencies of the geostationary ocean color satellite remotesensing image, the GOCI data used in the present disclosure scans thearea 8 times a day from 8: 30 to 15: 30 Beijing time, to obtain 8 imageswith a time interval of 1 hour, so the value of K is 8. v_(ik)(dk) isthe sub-temporal modes corresponding to the k-th time in v_(i)(t), andit contains data of the k-th time of all dates in v_(i)(t); dkrepresents the dates included in the data in v_(ik)(dk), and because theextremely-high-missing-rate data in the data is eliminated, in thedecomposed sub-temporal modes, dates corresponding to these data aremissing, so it is denoted by dk, that is, missing date corresponding tothe extremely-high-missing-rate data is not included in dk, and t inv_(i)(t) is also used to indicate data missing. Following steps willperform interpolation completion on the data at these missing datepositions.

S32: performing empirical decomposition for each sub-temporal modesv_(ik)(dk) in each column of the characteristic mode vectors:

${v_{ik}({dk})} = {{\sum\limits_{j = 1}^{n}{c_{j}({dk})}} + {r_{n}({dk})}}$

where n is the total number of moments represented by the images in theocean color remote sensing data set, c_(j)(dk) is the j-th inherent modefunction component, and r_(n)(dk) is residual of the decomposition;

S33: for each sub-temporal modes v_(ik)(dk), using the cubic splineinterpolation method to interpolate and fill all the inherent modefunction components c_(j)(dk) and the data missing date positions in theresidual signal r_(n)(dk), then using the previous empiricaldecomposition formula in a reverse direction, to re-superimpose all theinherent mode function components and residual signals after theinterpolation and filling, to obtain the sub-temporal modes v_(ik)(Dk)having the complete time series, and at this time, the data missing dateposition in the sub-temporal modes has been interpolated to fill thecorresponding data, so it is represented by Dk, That is, Dk contains allconsecutive dates of the k-th time.

S34: re-superimposing and merging the K sub-temporal mode v_(ik)(Dk) ofthe same column of the characteristic mode vectors, to obtain thecomplete characteristic mode vector v_(i)(T) including all moments, andT represents a moment corresponding to the extremely-high-missing-ratedata;

S35: sequentially re-superimposing and merging all P columns of thecomplete characteristic mode vectors v_(i)(T) into a matrix form again,to obtain the complete temporal modes V_(T) having a continuous timeseries.

Therefore, after the aforementioned S1 to S3, the complete spatial modesand the complete temporal modes have been obtained, that is, it ispossible to use the DINEOF method to reconstruct a geostationary oceancolor satellite data set with full coverage in time and space. Aspecific reconstruction method of step S4 will be briefly describedbelow:

S41: using the spatial modes U _(P) obtained in S2 and the completetemporal modes V_(T) obtained in S3, to recalculate the reconstructedvalues of all missing points in the matrix X, and the reconstructedvalue X_(i,j) ^(re) of the missing point located in the i-th row andj-th column of the matrix X is:

$X_{i,j}^{re} = {\left( {U_{\overset{\_}{P}}S_{\overset{\_}{P}}V_{T}^{T}} \right)_{i,j} = {\sum\limits_{p = 1}^{\overset{\_}{P}}{{{\overset{\_}{\rho}}_{p}\left( {\overset{\_}{u}}_{p} \right)}_{i}\left( {\overset{\_}{v}}_{p}^{T} \right)_{j}}}}$

where the superscript T all means transposition, which is different from“the moment T corresponding to the data having the extremely highmissing rate”; (ū_(p))_(i) is the i-th element in the p-th column of thespatial modes U _(P) , (v _(p) ^(T))_(j) is the j-th element in the p-thcolumn of the temporal modes V_(T), ρ _(p) is the singular valuecorresponding to (ū_(P))_(i) and (v _(p) ^(T))_(j);

S42: updating the reconstructed values of all missing points in thematrix X obtained in S41 to the matrix X, and in this update process,only the data missing points are updated, while the other positions ofthe matrix X remain unchanged, and a resulting updated matrix is thetime-space field of the ocean color remote sensing data set with themissing values being complemented. Through the updated matrix, it can bere-decomposed and restored, to obtain a geostationary ocean colorsatellite data set fully covering time and space, and all thegeostationary ocean color satellite remote sensing images in the dataset are continuous and complete in time and space.

Hereinafter, based on the above-mentioned embodiment method, it isapplied to a specific example to demonstrate its effect. The specificprocess is as described above and will not be repeated. The followingmainly shows its specific parameter settings and implementation effects.

Application Example

Taking GOCI total suspended matter concentration data of Hangzhou Bay in2017 as an example, the present disclosure will be described in detail,and specific steps are as follows:

1) using GOCI Level-1B data provided by the Korea Ocean Satellite Centerto perform atmospheric correction using an ultraviolet atmosphericcorrection method (UV-AC), inverting the total suspended matterconcentration data using the empirical modal. The GOCI data was invertedthrough a TSM inversion modal, and 2,920 pieces of inversion resultproducts (GOCI TSM) of the total suspended matter (TSM) of Hangzhou Bayin 2017 were obtained. A coverage range of each pieces of the GOCI TSMdata products is 120.55° E to 122.35° E, 29.85° N to 30.95° N, and animage element size is 0.005°×0.005°. An overall missing rate of the 2920pieces of the GOCI TSM data is 63.5%, and in terms of space, a missingrate (missing times) of the time series data of effective positionpoints in a study area is 99.93% for the highest (2918 pieces), and53.94% for the lowest (1575 pieces). In terms of time, a space missingrate in the 2920 pieces of the GOCI TSM data is 1000% for the highestand 0.89% for the lowest, and there are 1124 pieces of data having amissing rate larger than 95%, accounting for 38.49%.

2) According to the aforementioned step S1, for the 2920 pieces of theGOCI total suspended matter concentration data in 2017, setting L=9,that is, a 9×9 space window is used for concentration detection, theconcentration detection weight w_(conc) and the edge detection weightw_(prox) are both ½, the detection threshold O_(threshold) is set to 3,and the original data in the abnormal image elements, which serve as thedata missing points, is removed. Because of the limitation of DINEOF onthe missing rate of original data, original data corresponding to timeswhen the image missing rate is larger than 95% and all pixelscorresponding to position points where the time series missing rate islarger than 95% are further eliminated. The data elimination operationreduces the number of available GOCI TSM data from 2920 to 1794, areduction of 38.56%. Ocean pixels decreased from 39274 to 37152, areduction of 5.4%, and the pixels that are mainly eliminated are mainlyconcentrated in nearshore waters. An overall missing rate ofreconstructed data after elimination is 38.18%. A calculation basis inDINEOF is based on the mean and covariance of the original data, the TSMdata does not show Gaussian distribution, and conversion of the originaldata needs to be performed before DINEOF. Therefore, the presentdisclosure performs logarithmic processing with a base of 10 on theoriginal data.

3) Constructing a two-dimensional data matrix on the time-space field ofthe preprocessed ocean color remote sensing data set according to theaforementioned S2 step, performing reconstruction by the DataINterpolating Empirical Orthogonal Functions, and using Laplacesmoothing filter to filter the time covariance matrix before eachiteration decomposition; after the reconstruction, obtaining the optimalretained mode number and the corresponding temporal modes and spatialmodes. In this embodiment, a is set to 0.01, while s_(max) is set to 30.Finally, when P=169, the root mean square error R is the smallest, sothe optimal retained mode number P=169, and the singular value matrix S_(P) , the temporal modes V _(P) and the optimal spatial modes U _(P) .

4) Decomposing the temporal modes V _(P) into characteristic modevectors in columns according to the step S3, decomposing each column ofthe characteristic mode vectors into 8 sub-temporal mode according tothe time to which the data belongs, then performing empirical modaldecomposition on each of the sub-temporal mode, using the cubic splineinterpolation method to complement the data at positions correspondingto the extremely-high-missing-rate data in the inherent mode functioncomponents and the residual after decomposition, and re-superimposingand merging to obtain the complete temporal modes having a continuoustime series;

5) according to the step S4, using the spatial modes obtained in S2 andthe complete temporal modes obtained in S3, to reconstruct thegeostationary ocean color satellite data set with full coverage in timeand space.

The reconstruction method of the present disclosure is named DINEOF-G,and compared with the original classic DINEOF method (ALVERA-AZCÁRATE A,BARTH A, RIXEN M, et at. Reconstruction of incomplete oceanographic datasets using empirical orthogonal functions: application to the AdriaticSea surface temperature[J]. Ocean Modelling. 2005, 9(4): 325-346.), theretained mode number (Mode), reconstruction accuracy (MSE, RMSE, MAE), asignal interpretation rate and a reconstruction rate have been improved,as shown in Table 1 below. The reconstruction accuracy has beenincreased by 8% (calculated by MSE), and the data reconstruction ratehas been increased from 58% to 94%, an increase of 36%.

TABLE 1 Reconstruction Interpretation Reconstruction method Mode MSERMSE MAE rate rate Classic DINEOF 140 0.0212 0.1457 0.1044 95.01% 58.24%method The present 169 0.0195 0.1395 0.0998 96.82% 94.6% disclosure

In order to further analyze the effect of each step in the DINEOF-Gmethod proposed by the present disclosure and its influence on thereconstruction result, a manner of adjusting the steps and experimentalparameters is adopted, and comparative tests of different practices aredesigned. SPecific schemes and test results are shown in Table 2, andtest parameters in the table indicate the steps performed (including amissing rate parameter).

TABLE 2 Experimental Interpretation Reconstruction No. parameter ModeMSE RMSE MAE rate rate Test 1 Eliminate data missing 137 0.0219 0.14790.1059 95.21% 74.86% 100% DINEOF Test 2 Eliminate data missing 1400.0212 0.1457 0.1044 95.01% 58.24% 95% DINEOF Test 3 Abnormal imageelement 144 0.0204 0.1427 0.1028 95.17% 58.12% detection Eliminate datamissing 95% DINEOF Test 4 Abnormal image element 169 0.0184 0.13550.0977 96.82% 58.12% detection Eliminate data missing 95% Laplacesmoothing filter DINEOF Test 5 Abnormal image element 169 0.0195 0.13950.0998 96.82% 94.6% detection Eliminate data missing 95% Laplacesmoothing filter DINEOF Time mode quadratic decomposition interpolation

The parameters of Test 2 are consistent with those of the classic DMNOF,and the parameters of the Test 5 are consistent with those of theDNEOF-G proposed by the present disclosure. A sequence of the testparameters is consistent with a sequence of operating procedures in thetest. From the specific test parameters and test results analysis, itcan be seen that: when comparing the Test 2 and Test 3, the abnormalimage element detection can improve the accuracy of reconstruction, andalthough some data is lost, it has less impact on the reconstructionrate, that is, the detection of abnormal image elements has a limitedimpact on the spatial and temporal missing rate of the original data.Comparing the Test 3 and Test 4, Laplace smoothing filter furtherimproved the reconstruction accuracy, while significantly increasing thenumber of retained modes, and correspondingly increasing the signalinterpretation rate. It reflects that the Laplace smoothing filter canbetter smooth the data fault caused by the night gap and the continuouslack of data, and it restores some of the characteristics of theoriginal data. Comparing the Test 4 and Test 5, the Test 5, based on thereconstruction results of the Test 4, uses the time mode quadraticdecomposition interpolation to reconstruct the data at times having thespatial missing rate larger than 95%. The reconstruction accuracy of theTest 5 is slightly lower than that of the Test 4, mainly due tointroduction of additional sample observation values (data with amissing rate higher than 95% and lower than 100%), but it is stillbetter than the Test 1 using the same sample, and this also reflects theaccuracy of the time mode quadratic decomposition interpolation resultsis relatively high.

In order to visually demonstrate the reconstruction effect, under theDINEOF-G reconstruction method of the present disclosure, the GOCI TSMdata reconstruction results of Hangzhou Bay for 8 times a day for 365days in 2017 are obtained. FIG. 2 illustrates the original data and thereconstructed data of the TSM on Jun. 2, 2017 (upper eight ones areoriginal images, and lower eight ones are reconstructed images), and theoverall missing rate of the eight original data on that day is 76.68%,of which the date missing rates at 8:30 and 10:30 exceed 95%. It can beseen that when using the time mode quadratic decomposition interpolationto reconstruct discontinuous missing data, although the temporal modescorresponding to the data at 9:30 is not used for interpolation, thedata still has good continuity, and the dynamic characteristics areobvious, which is an effect that the classic DINEOF method cannotachieve using averages to reconstruct the data with high missing rates.For cloud trails that appear in the data at 15:30, the reconstructeddata can also be better filtered. Accuracy evaluation is performed onvalues of the valued image elements in the original data and values ofthe reconstructed image elements in the same area of the reconstructeddata, and evaluation results are as shown in FIG. 3 , original valuesand reconstructed predicted values are concentrated around a y=x line,RSM is 0.0098, RMSE is 0.0989, and MAE is 0.0645, showing thatrespective accuracy indicators of the method of the present disclosureare relatively ideal, making it have very important practicalapplication values for the reconstruction of the geostationary oceancolor satellite data.

The above embodiment is only a preferred solution of the presentdisclosure, but it is not intended to limit the present disclosure.Those of ordinary skill in the related art can make various changes andmodifications without departing from the spirit and scope of the presentdisclosure. Therefore, all technical solutions obtained by equivalentsubstitutions or equivalent transformations fall within the protectionscope of the present disclosure.

What is claimed is:
 1. A reconstruction method for geostationary oceancolor satellite data based on Data INterpolating Empirical OrthogonalFunctions, the reconstruction method comprising steps of: S1: performingpreprocessing for each geostationary ocean color satellite remotesensing image in an original ocean color remote sensing data set to bereconstructed, and setting abnormal image elements as data missingpoints, and eliminating, from the data set, extremely-high-missing-ratedata having a missing rate higher than an upper limit in time or space;S2: constructing a two-dimensional data matrix based on a time-spacefield of preprocessed ocean color remote sensing data set, performingreconstruction by the Data INterpolating Empirical Orthogonal Functions,and, before each iteration decomposition, filtering a time covariancematrix using Laplace smoothing filter; and after the reconstruction,obtaining an optimal retained mode number, and a corresponding temporalmodes and spatial modes; S3: decomposing the temporal modes intocharacteristic mode vectors according to columns, each column of thecharacteristic mode vectors is decomposed into a plurality ofsub-temporal mode according to time to which the data belongs, thenperforming empirical modal decomposition on each of the plurality ofsub-temporal mode, using a cubic spline interpolation method tocomplement data at positions corresponding to theextremely-high-missing-rate data in inherent mode function componentsand residual after decomposition, and re-superimposing and merging toobtain a complete temporal modes having a continuous time series; andS4: using the spatial modes obtained in the step S2 and the completetemporal modes obtained in the step S3, to reconstruct to obtain ageostationary ocean color satellite data set with full coverage in timeand space.
 2. The reconstruction method according to claim 1, wherein inthe step S1, for each geostationary ocean color satellite remote sensingimage, the abnormal image elements are identified and processedaccording to steps of S11 to S15: S11: for each geostationary oceancolor satellite remote sensing image in an original ocean color remotesensing data set to be reconstructed, setting a space window having aside length of L by respectively taking each valued image element in theimage as a center; S12: for each valued image element, calculatingconcentration O_(conc) of the image element by taking all valued imageelements in the space window corresponding to the image element as asample data set O: $O_{conc} = {❘\frac{O_{i} - O_{m}}{\delta}❘}$ whereO_(i) is a current image element value of the concentration to becalculated, O_(m) is a median of the sample data set O, and a standarddeviation δ is δ=k*MAD(O), where k is a scale factor constant, andMAD(O) is a median absolute deviation of the sample data set O; S13: foreach valued image element respectively, calculating an edge detectionvalue O_(prox) of the image element, and O_(prox) being a positive valueif a valueless image element exists around the image element, orO_(prox) being zero if no valueless image element exists around theimage element, then O_(prox) will take zero; S14: for each valued imageelement respectively, calculating an anomaly detection result O_(final)of the image element:O _(final) =w _(prox) O _(prox) +w _(conc) O _(conc) where w_(prox) isan edge detection weight, w_(conc) is a concentration detection weight,and it is satisfied that w_(prox)+w_(conc)=1; and S15: according to apreset threshold O_(threshold) of abnormal image element detection,taking the image element having the O_(final) larger than the thresholdO_(threshold) as the abnormal image element, and setting the abnormalimage element as the data missing point.
 3. The reconstruction methodaccording to claim 1, wherein in the step S1, eliminating theextremely-high-missing-rate data comprises: in terms of time,eliminating an image corresponding to the moment if a missing rate ofthe geostationary ocean color satellite remote sensing image at anymoment is larger than an upper limit of the missing rate; and in termsof space, eliminating the image element values of the position point inall images if a missing rate of the time series data at any positionpoint in all geostationary ocean color satellite remote sensing imagesis larger than the upper limit of the missing rate.
 4. Thereconstruction method according to claim 3, wherein the upper limit ofthe missing rate is 95%.
 5. The reconstruction method according to claim1, wherein the step S2 comprises steps of: S21: setting the time-spacefield of the preprocessed ocean color remote sensing data set as an m×ntwo-dimensional data matrix X⁰, where m is the number of the spatialposition points in the geostationary ocean color satellite remotesensing image, n is the total number of moments represented by the imagein the ocean color remote sensing data set, and m>n, wherein for thematrix X⁰, the m rows respectively represent the time series values of mdifferent spatial position points, and the n columns respectivelyrepresent the values of all the spatial position points at n differentmoments; S22: calculating an average X⁰ of all valid data in the matrixX⁰, and letting X=X⁰−X⁰ ; and randomly selecting a part of the validdata from the matrix X as a cross-validation set X^(c), and settingvalues of all the missing points of the data in the matrix X and theelements extracted into the cross-validation set X^(c) to be zero; S23:iterating on the number P of the retained modes, calculating the spatialmodes U_(P), the temporal modes V_(P), and a root mean square error Rbetween the reconstructed value and the original value of thecross-validation set under different P values; wherein steps S231 toS234 are executed in each iteration: S231: performing a singular valuedecomposition on the matrix X, calculating P largest singular values andsingular vectors through eigenvector decomposition, then obtaining anm×P spatial modes U_(P), P×P singular value matrix S_(p), and n×Ptemporal modes V_(P), and a calculation equation thereof is:${{X^{T}{Xv}_{i}} = {\rho_{i}^{2}v_{i}}}{u_{i} = \frac{{Xv}_{i}}{\rho_{i}}}$where u_(i) is an i-th column of the spatial modes U_(P), v_(i) is ani-th column of the temporal modes V_(P), ρ_(i) is a singular valuecorresponding to u_(i) and v_(i), i=1, . . . , P; and X^(T)X is an n×ntime covariance matrix; S232: using the Laplace smoothing filter tofirst filter columns of the time covariance matrix, and then filter rowsof the time covariance matrix; wherein for the rows or columns of thetime covariance matrix, each is regarded as a time series data setC={c₁, c₂, . . . , c_(n)} comprising data of n moments, and a time setcorresponding to C is T={t₁, t₂, . . . , t_(n)}, c_(l) is time seriesdata of a l-th time t₁, l=1,2, . . . , n, and it is satisfied thatt₁<t₂< . . . <t_(n); and a formula for the Laplace smoothing filter is:${c_{l}^{({s + 1})} = {c_{l}^{(s)} + \frac{w_{l + 1} - w_{l}}{t_{l + 1}^{\prime} - t_{l}^{\prime}}}}{w_{l} = {\alpha\frac{c_{l} - c_{l - 1}}{t_{l} - t_{l - 1}}}}{t_{l}^{\prime} = {t_{l - 1} + t_{l}}}$where c_(l) ^((s)) represents data c_(l) before the Laplace smoothingfilter, c_(l) ^((s+1)) represents data c_(l) after the Laplace smoothingfilter, and α is a smoothing strength, 0<α<1; S233: after completing theLaplace smoothing filter once, calculating reconstructed values of alldata missing points in the matrix X, wherein a reconstructed valueX_(i,j) ^(re) of a missing point located in the i-th row and j-th columnof the matrix X is:$X_{i,j}^{re} = {\left( {U_{P}S_{P}V_{P}^{T}} \right)_{i,j} = {\sum\limits_{p = 1}^{P}{{\rho_{p}\left( u_{p} \right)}_{i}\left( v_{p}^{T} \right)_{j}}}}$where (u_(p))_(i) is an i-th element in a p-th column of the spatialmodes U_(P), (v_(p) ^(T))_(j) is a j-th element in a p-th column of thetemporal modes V_(P), and ρ_(p) is a corresponding singular value; andS234: calculating a root mean square error R between the reconstructedvalue and the true value of all the elements in the cross-validationset:$R = \sqrt{\frac{1}{N}{\sum_{p = 1}^{N}\left( {X_{p}^{re} - X_{p}^{c}} \right)^{2}}}$where N is the total number of data in the cross-validation set; X_(p)^(re) and X_(p) ^(c) are the reconstructed value and true value of thep-th element in the cross-validation set, respectively; wherein if theroot mean square error R is larger than an error threshold, thereconstructed values of all missing points in the matrix X obtained inthe step S233 are updated to the matrix X, then the steps S231 to S233are repeated for the updated matrix X, until the root mean square errorR converges below the threshold, then the repetition stops, and theconverged root mean square error R is recorded as the root mean squareerror R^(P) under the current retained mode number P; and S24: comparingthe root mean square error R^(P) under different retained mode numbersP=1, 2, . . . , k_(MAX), taking the P value corresponding to thesmallest R^(P) as an optimal retained mode number P, to obtain anoptimal temporal modes V _(P) and an optimal spatial modes U _(P) . 6.The reconstruction method according to claim 5, wherein in the step S23,an iteration range of the retained mode number P is [1, k_(MAX)], andthe maximum number of iterations k_(MAX)≤min(m,n).
 7. The reconstructionmethod according to claim 5, wherein in the step S23, if the number ofrepetitions of the steps S231 to S233 for the singular valuedecomposition and the Laplace smoothing filter reaches the maximumnumber of times, the repetition stops, and a final root mean squareerror R is directly recorded as the root mean square error R^(P) underthe current retained mode number P.
 8. The reconstruction methodaccording to claim 1, wherein the step S3 comprises steps of: S31:decomposing each column of characteristic mode vectors in the optimaltemporal modes V _(P) , and when decomposing, decomposing data belongingto same time in different days into one sub-temporal modes with a dailycycle, and the i-th column characteristic mode vector v_(i)(t) isdecomposed into:v _(i)(t)={v _(i1)(d1),v _(i2)(dk), . . . , v _(iK)(dK)} where K is thetotal number of daily data frequencies of the geostationary ocean colorsatellite remote sensing image; v_(ik)(dk) is the sub-temporal modescorresponding to the k-th time in v_(i)(t), and comprises data of thek-th time of all dates in v_(i)(t); i=1,2, . . . , P; k=1,2, . . . , K;dk represents the dates comprised in the data in v_(ik)(dk), and amissing date corresponding to the extremely-high-missing-rate data isnot comprised in dk; S32: performing empirical decomposition for eachsub-temporal modes v_(ik)(dk) in each column of the characteristic modevectors:$v_{ik} = {({dk}) = {{\sum\limits_{j = 1}^{n}{c_{j}({dk})}} + {r_{n}({dk})}}}$where n is the total number of moments represented by the images in theocean color remote sensing data set, and c_(j)(dk) is the j-th inherentmode function component, and r_(n)(dk) is residual of the decomposition;S33: for each sub-temporal modes v_(ik)(dk), using a cubic splineinterpolation method to interpolate and fill all the inherent modefunction components c_(j)(dk) and the data missing date positions in theresidual signal r_(n)(dk), then re-superimposing all the inherent modefunction components and residual signals after the interpolation andfilling, to obtain the sub-temporal modes v_(ik)(Dk) having a completetime series, wherein Dk comprises all consecutive dates having the k-thtime; S34: re-superimposing and merging the K sub-temporal modev_(ik)(Dk) of a same column of the characteristic mode vectors, toobtain the complete characteristic mode vector v_(i)(T) comprising allmoments; and S35: re-superimposing and merging all P columns of thecomplete characteristic mode vectors v_(i)(T), to obtain the completetemporal modes V_(T) having a continuous time series.
 9. Thereconstruction method according to claim 8, wherein the total number Kof the daily data frequencies of the geostationary ocean color satelliteremote sensing image is
 8. 10. The reconstruction method according toclaim 1, wherein the step S4 comprises steps of: S41: using the spatialmodes U _(P) obtained in the step S2 and the complete temporal modesV_(T) obtained in the step S3, to recalculate the reconstructed valuesof all missing points in the matrix X, and the reconstructed valueX_(i,j) ^(re) of the missing point located in the i-th row and j-thcolumn of the matrix X is:$X_{i,j}^{re} = {\left( {U_{\overset{\_}{P}}S_{\overset{\_}{P}}V_{T}^{T}} \right)_{i,j} = {\sum\limits_{p = 1}^{\overset{\_}{P}}{{{\overset{\_}{\rho}}_{p}\left( {\overset{\_}{u}}_{p} \right)}_{i}\left( {\overset{\_}{v}}_{p}^{T} \right)_{j}}}}$where (ū_(p))_(i) is the i-th element in the p-th column of the spatialmodes U _(P) , (v _(p) ^(T))_(j) is the j-th element in the p-th columnof the temporal modes V_(T), p _(p) is a corresponding singular value;and S42: updating the reconstructed values of all missing points in thematrix X obtained in the step S41 to the matrix X, and keeping values ofother elements of the matrix X unchanged, to obtain the time-space fieldof the ocean color remote sensing data set with the missing values beingcomplemented, so as to obtain a geostationary ocean color satellite dataset fully covering time and space.